SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Donson, Jon 

Dawson, William O.
Grantham, George L. 
Turpen, Thomas H. 
Turpen, Ann Myers 
Garger, Stephen J. 
Grill, Laurence K. 

(ii) TITLE OF INVENTION: RECOMBINANT PLANT VIRAL NUCLEIC ACIDS

(iii) NUMBER OF SEQUENCES: 11 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Limbach & Limbach 

(B) STREET: 2001 Ferry Building 

(C) CITY: San Francisco 

(D) STATE: CAL 
(F) ZIP: 94111 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 
(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 600,244 

(B) FILING DATE: 22-OCT-1990

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 641,617 

(B) FILING DATE: 16-JAN-1991

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 310,881 

(B) FILING DATE: 17-FEB-1989

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 160,766 

(B) FILING DATE: 26-FEB-1988



(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 160,771 

(B) FILING DATE: 26-FEB-1988






(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 347,637 

(B) FILING DATE: 05-MAY-1989

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 363,138 

(B) FILING DATE: 08-JUN-1989 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 219,279 

(B) FILING DATE: 15-JUL-1988

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Halluin, Albert P. 

(B) REGISTRATION NUMBER: 28,957

(C) REFERENCE/DOCKET NUMBER: BIOG-20121 USA

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 415-433-4150 

(B) TELEFAX: 415-433-8716 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Pro Xaa Gly Pro 

1 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GGGTACCTGG GCC 13



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 886 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Chinese cucumber

(vii) IMMEDIATE SOURCE: 

(B) CLONE: alpha-trichosanthin

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 8..877

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CTCGAGG ATG ATC AGA TTC TTA GTC CTC TCT TTG CTA ATT CTC AGO CTdS

Met Ile Arg Phe Leu Val Leu Ser Leu Leu Ile Leu Thr Leu
1 5 10

TTC CTA ACA ACT CCT GCT GTG GAG GGC GAT GTT AGC TTC CGT TTA TCffl"?

Phe Leu Thr Thr Pro Ala Val Glu Gly Asp Val Ser Phe Arg Leu Ser
15 20 25 30

GGT GCA ACA AGC AGT TCC TAT GGA GTT TTC ATT TCA AAT CTG AGA ASMS

Gly Ala Thr Ser Ser Ser Tyr Gly Val Phe Ile Ser Asn Leu Arg Lys
35 40 45

GCT CTT CCA AAT GAA AGG AAA CTG TAC GAT ATC CCT CTG TTA CGT TOO 3

Ala Leu Pro Asn Glu Arg Lys Leu Tyr Asp Ile Pro Leu Leu Arg Ser
50 55 60

TCT CTT CCA GGT TCT CAA CGC TAC GCA TTG ATC CAT CTC ACA AAT T/B31

Ser Leu Pro Gly Ser Gln Arg Tyr Ala Leu Ile His Leu Thr Asn Tyr
65 70 75

GCC GAT GAA ACC ATT TCA GTG GCC ATA GAC GTA ACG AAC GTC TAT ATZBS

Ala Asp Glu Thr Ile Ser Val Ala Ile Asp Val Thr Asn Val Tyr Ile
80 85 90

ATG GGA TAT CGC GCT GGC GAT ACA TCC TAT TTT TTC AAC GAG GCT TaB7

Met Gly Tyr Arg Ala Gly Asp Thr Ser Tyr Phe Phe Asn Glu Ala Ser
95 100 105 110

GCA ACA GAA GCT GCA AAA TAT GTA TTC AAA GAC GCT ATG CGA AAA GISISS

Ala Thr Glu Ala Ala Lys Tyr Val Phe Lys Asp Ala Met Arg Lys Val
115 120 125

ACG CTT CCA TAT TCT GGC AAT TAC GAA AGG CTT CAA ACT GCT GCG 0(303

Thr Leu Pro Tyr Ser Gly Asn Tyr Glu Arg Leu Gln Thr Ala Ala Gly
130 135 140

AAA ATA AGG GAA AAT ATT CCG CTT GGA CTC CCA GCT TTG GAC AGT GCSCBl

Lys Ile Arg Glu Asn Ile Pro Leu Gly Leu Pro Ala Leu Asp Ser Ala
145 150 155

ATT ACC ACT TTG TTT TAC TAC AAC GCC AAT TCT GCT GCG TCG GCA CEZS

Ile Thr Thr Leu Phe Tyr Tyr Asn Ala Asn Ser Ala Ala Ser Ala Leu
160 165 170

ATG GTA CTC ATT CAG TCG ACG TCT GAG GCT GCG AGG TAT AAA TTT A'HI77

Met Val Leu Ile Gln Ser Thr Ser Glu Ala Ala Arg Tyr Lys Phe Ile
175 180 185 190

GAG CAA CAA ATT GGG AAG CGC GTT GAC AAA ACC TTC CTA CCA AGT TEES

Glu Gln Gln Ile Gly Lys Arg Val Asp Lys Thr Phe Leu Pro Ser Leu
195 200 205

GCA ATT ATA AGT TTG GAA AAT AGT TGG TCT GCT CTC TCC AAG CAA A1Hr73

Ala Ile Ile Ser Leu Glu Asn Ser Trp Ser Ala Leu Ser Lys Gln Ile
210 215 220

CAG ATA GCG AGT ACT AAT AAT GGA CAG TTT GAA ACT CCT GTT GTG CTTEl

Gln Ile Ala Ser Thr Asn Asn Gly Gln Phe Glu Thr Pro Val Val Leu
225 230 235

ATA AAT GCT CAA AAC CAA CGA GTC ATG ATA ACC AAT GTT GAT GCT GGBEg

Ile Asn Ala Gln Asn Gln Arg Val Met Ile Thr Asn Val Asp Ala Gly
240 245 250

GTT GTA ACC TCC AAC ATC GCG TTG CTG CTG AAT CGA AAC AAT ATG GOBI 7

Val Val Thr Ser Asn Ile Ala Leu Leu Leu Asn Arg Asn Asn Met Ala
255 260 265 270



GCC ATG GAT GAC GAT GTT CCT ATG ACA CAG AGC TTT GGA TGT GGA ACBB5

Ala Met Asp Asp Asp Val Pro Met Thr Gln Ser Phe Gly Cys Gly Ser
275 280 285

TAT GCT ATT TAGTAACTCG AG 886

Tyr Ala Ile

290 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



Met Ile Arg Phe Leu Val Leu Ser Leu Leu Ile Leu Thr Leu Phe Leu
1 5 10 15

Thr Thr Pro Ala Val Glu Gly Asp Val Ser Phe Arg Leu Ser Gly Ala
20 25 30

Thr Ser Ser Ser Tyr Gly Val Phe Ile Ser Asn Leu Arg Lys Ala Leu
35 40 45

Pro Asn Glu Arg Lys Leu Tyr Asp Ile Pro Leu Leu Arg Ser Ser Leu
50 55 60

Pro Gly Ser Gln Arg Tyr Ala Leu Ile His Leu Thr Asn Tyr Ala Asp
65 70 75 80

Glu Thr Ile Ser Val Ala Ile Asp Val Thr Asn Val Tyr Ile Met Gly
85 90 95

Tyr Arg Ala Gly Asp Thr Ser Tyr Phe Phe Asn Glu Ala Ser Ala Thr
100 105 110

Glu Ala Ala Lys Tyr Val Phe Lys Asp Ala Met Arg Lys Val Thr Leu
115 120 125

Pro Tyr Ser Gly Asn Tyr Glu Arg Leu Gln Thr Ala Ala Gly Lys Ile
130 135 140

Arg Glu Asn Ile Pro Leu Gly Leu Pro Ala Leu Asp Ser Ala Ile Thr
145 150 155 160

Thr Leu Phe Tyr Tyr Asn Ala Asn Ser Ala Ala Ser Ala Leu Met Val
165 170 175

Leu Ile Gln Ser Thr Ser Glu Ala Ala Arg Tyr Lys Phe Ile Glu Gln
180 185 190

Gln Ile Gly Lys Arg Val Asp Lys Thr Phe Leu Pro Ser Leu Ala Ile
195 200 205

Ile Ser Leu Glu Asn Ser Trp Ser Ala Leu Ser Lys Gln Ile Gln Ile
210 215 220

Ala Ser Thr Asn Asn Gly Gln Phe Glu Thr Pro Val Val Leu Ile Asn
225 230 235 240

Ala Gln Asn Gln Arg Val Met Ile Thr Asn Val Asp Ala Gly Val Val
245 250 255

Thr Ser Asn Ile Ala Leu Leu Leu Asn Arg Asn Asn Met Ala Ala Met
260 265 270

Asp Asp Asp Val Pro Met Thr Gln Ser Phe Gly Cys Gly Ser Tyr Ala
275 280 285

Ile



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1452 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Oryza sativa

(vii) IMMEDIATE SOURCE:

(B) CLONE: alpha-amylase

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 12..1316

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CCTCGAGGTG C ATG CAG GTG CTG AAC ACC ATG GTG AAC A CAC TTC TTG 50

Met Gln Val Leu Asn Thr Met Val Asn Lys His Phe Leu
1 5 10

TCC CTT TCG GTC CTC ATC GTC CTC CTT GGC CTC TCC TCC AAC TTG ACfflS

Ser Leu Ser Val Leu Ile Val Leu Leu Gly Leu Ser Ser Asn Leu Thr
15 20 25

GCC GGG CAA GTC CTG TTT CAG GGA TTC AAC TGG GAG TCG TGG AAG Gfla6

Ala Gly Gln Val Leu Phe Gln Gly Phe Asn Trp Glu Ser Trp Lys Glu
30 35 40 45

AAT GGC GGG TGG TAC AAC TTC CTG ATG GGC AAG GTG GAC GAC ATC nnma.

Asn Gly Gly Trp Tyr Asn Phe Leu Met Gly Lys Val Asp Asp Ile Ala
50 55 60

GCA GCC GGC ATC ACC CAC GTC TGG CTC CCT CCG CCG TCT CAC TCT GT332

Ala Ala Gly Ile Thr His Val Trp Leu Pro Pro Pro Ser His Ser Val
65 70 75

GGC GAG CAA GGC TAC ATG CCT GGG CGG CTG TAC GAT CTG GAC GCG TC2B0

Gly Glu Gln Gly Tyr Met Pro Gly Arg Leu Tyr Asp Leu Asp Ala Ser
80 85 90

AAG TAC GGC AAC GAG GCG CAG CTC AAG TCG CTG ATC GAG GCG TTC CSB8

Lys Tyr Gly Asn Glu Ala Gln Leu Lys Ser Leu Ile Glu Ala Phe His
95 100 105

GGC AAG GGC GTC CAG GTG ATC GCC GAC ATC GTC ATC AAC CAC CGC AODS

Gly Lys Gly Val Gln Val Ile Ala Asp Ile Val Ile Asn His Arg Thr
110 115 120 125

GCG GAG CAC AAG GAC GGC CGC GGC ATC TAC TGC CTC TTC GAG GGC GC3S4

Ala Glu His Lys Asp Gly Arg Gly Ile Tyr Cys Leu Phe Glu Gly Gly
130 135 140

ACG CCC GAC TCC CGC CTC GAC TGG GGC CCG CAC ATG ATC TGC CGC GMCS2

Thr Pro Asp Ser Arg Leu Asp Trp Gly Pro His Met Ile Cys Arg Asp
145 150 155

GAC CCC TAC GGC GAT GGC ACC GGC AAC CCG GAC ACC GGC GCC GAC TS30

Asp Pro Tyr Gly Asp Gly Thr Gly Asn Pro Asp Thr Gly Ala Asp Phe
160 165 170

GCC GCC GCG CCG GAC ATC GAC CAC CTC AAC AAG CGC GTC CAG CGG GISSB

Ala Ala Ala Pro Asp Ile Asp His Leu Asn Lys Arg Val Gln Arg Glu
175 180 185



CTC ATT GGC TGG CTC GAC TGG CTC AAG ATG GAC ATC GGC TTC GAC GCH26

Leu Ile Gly Trp Leu Asp Trp Leu Lys Met Asp Ile Gly Phe Asp Ala
190 195 200 205

TGG CGC CTC GAC TTC GCC AAG GGC TAC TCC GCC GAC ATG GCA AAG ATB:74

Trp Arg Leu Asp Phe Ala Lys Gly Tyr Ser Ala Asp Met Ala Lys Ile
210 215 220

TAC ATC GAC GCC ACC GAG CCG AGC TTC GCC GTG GCC GAG ATA TGG AOS 2

Tyr Ile Asp Ala Thr Glu Pro Ser Phe Ala Val Ala Glu Ile Trp Thr
225 230 235

TCC ATG GCG AAC GGC GGG GAC GGC AAG CCG AAC TAC GAC CAG AAC GCCO

Ser Met Ala Asn Gly Gly Asp Gly Lys Pro Asn Tyr Asp Gln Asn Ala
240 245 250

CAC CGG CAG GAG CTG GTC AAC TGG GTC GAT CGT GTC GGC GGC GCC RNCIB

His Arg Gln Glu Leu Val Asn Trp Val Asp Arg Val Gly Gly Ala Asn
255 260 265

AGC AAC GGC ACG GCG TTC GAC TTC ACC ACC AAG GGC ATC CTC AAC GTSE6

Ser Asn Gly Thr Ala Phe Asp Phe Thr Thr Lys Gly Ile Leu Asn Val
270 275 280 285

GCC GTG GAG GGC GAG CTG TGG CGC CTC CGC GGC GAG GAC GGC AAG G(314

Ala Val Glu Gly Glu Leu Trp Arg Leu Arg Gly Glu Asp Gly Lys Ala
290 295 300

CCC GGC ATG ATC GGG TGG TGG CCG GCC AAG GCG ACG ACC TTC GTC GiSCE2

Pro Gly Met Ile Gly Trp Trp Pro Ala Lys Ala Thr Thr Phe Val Asp
305 310 315

AAC CAC GAC ACC GGC TCG ACG CAG CAC CTG TGG CCG TTC CCC TCC OAnO

Asn His Asp Thr Gly Ser Thr Gln His Leu Trp Pro Phe Pro Ser Asp
320 325 330

AAG GTC ATG CAG GGC TAC GCA TAC ATC CTC ACC CAC CCC GGC AAC 00358

Lys Val Met Gln Gly Tyr Ala Tyr Ile Leu Thr His Pro Gly Asn Pro
335 340 345

TGC ATC TTC TAC GAC CAT TTC TTC GAT TGG GGT CTC AAG GAG GAG imX>6

Cys Ile Phe Tyr Asp His Phe Phe Asp Trp Gly Leu Lys Glu Glu Ile
350 355 360 365






GAG CGC CTG GTG TCA ATC AGA AAC CGG CAG GGG ATC CAC CCG GCG ISGSSi

Glu Arg Leu Val Ser Ile Arg Asn Arg Gln Gly Ile His Pro Ala Ser
370 375 380

GAG CTG CGC ATC ATG GAA GCT GAC AGC GAT CTC TAC CTC GCG GAG 7I133D2

Glu Leu Arg Ile Met Glu Ala Asp Ser Asp Leu Tyr Leu Ala Glu Ile
385 390 395

GAT GGC AAG GTG ATC ACA AAG ATT GGA CCA AGA TAC GAC GTC GAA CKK50

Asp Gly Lys Val Ile Thr Lys Ile Gly Pro Arg Tyr Asp Val Glu His
400 405 410

CTC ATC CCC GAA GGC TTC CAG GTC GTC GCG CAC GGT GAT GGC TAC a£3S8

Leu Ile Pro Glu Gly Phe Gln Val Val Ala His Gly Asp Gly Tyr Ala
415 420 425

ATC TGG GAG AAA ATC TGAGCGCACG ATGACGAGAC TCTCAGTTTA GCAGA' l"i"llftft) 3

Ile Trp Glu Lys Ile
430 435

CCTGCGATTT TTACCCTGAC CGGTATACGT ATATACGTGC CGGCAACGAG CTGTATCCGA 1413

TCCGAATTAC GGATGCAATT GTCCACGAAG TCCTCGAGG 1452



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 434 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Gln Val Leu Asn Thr Met Val Asn Lys His Phe Leu Ser Leu Ser
1 5 10 15

Val Leu Ile Val Leu Leu Gly Leu Ser Ser Asn Leu Thr Ala Gly Gln
20 25 30

Val Leu Phe Gln Gly Phe Asn Trp Glu Ser Trp Lys Glu Asn Gly Gly
35 40 45

Trp Tyr Asn Phe Leu Met Gly Lys Val Asp Asp Ile Ala Ala Ala Gly
50 55 60

Ile Thr His Val Trp Leu Pro Pro Pro Ser His Ser Val Gly Glu Gln
65 70 75 80

Gly Tyr Met Pro Gly Arg Leu Tyr Asp Leu Asp Ala Ser Lys Tyr Gly
85 90 95

Asn Glu Ala Gln Leu Lys Ser Leu Ile Glu Ala Phe His Gly Lys Gly
100 105 110

Val Gln Val Ile Ala Asp Ile Val Ile Asn His Arg Thr Ala Glu His
115 120 125

Lys Asp Gly Arg Gly Ile Tyr Cys Leu Phe Glu Gly Gly Thr Pro Asp
130 135 140

Ser Arg Leu Asp Trp Gly Pro His Met Ile Cys Arg Asp Asp Pro Tyr
145 150 155 160

Gly Asp Gly Thr Gly Asn Pro Asp Thr Gly Ala Asp Phe Ala Ala Ala
165 170 175

Pro Asp Ile Asp His Leu Asn Lys Arg Val Gln Arg Glu Leu Ile Gly
180 185 190

Trp Leu Asp Trp Leu Lys Met Asp Ile Gly Phe Asp Ala Trp Arg Leu
195 200 205

Asp Phe Ala Lys Gly Tyr Ser Ala Asp Met Ala Lys Ile Tyr Ile Asp
210 215 220

Ala Thr Glu Pro Ser Phe Ala Val Ala Glu Ile Trp Thr Ser Met Ala
225 230 235 240

Asn Gly Gly Asp Gly Lys Pro Asn Tyr Asp Gln Asn Ala His Arg Gln
245 250 255

Glu Leu Val Asn Trp Val Asp Arg Val Gly Gly Ala Asn Ser Asn Gly
260 265 270

Thr Ala Phe Asp Phe Thr Thr Lys Gly Ile Leu Asn Val Ala Val Glu
275 280 285

Gly Glu Leu Trp Arg Leu Arg Gly Glu Asp Gly Lys Ala Pro Gly Met
290 295 300

Ile Gly Trp Trp Pro Ala Lys Ala Thr Thr Phe Val Asp Asn His Asp
305 310 315 320

Thr Gly Ser Thr Gln His Leu Trp Pro Phe Pro Ser Asp Lys Val Met
325 330 335

Gln Gly Tyr Ala Tyr Ile Leu Thr His Pro Gly Asn Pro Cys Ile Phe
340 345 350

Tyr Asp His Phe Phe Asp Trp Gly Leu Lys Glu Glu Ile Glu Arg Leu
355 360 365

Val Ser Ile Arg Asn Arg Gln Gly Ile His Pro Ala Ser Glu Leu Arg
370 375 380

Ile Met Glu Ala Asp Ser Asp Leu Tyr Leu Ala Glu Ile Asp Gly Lys
385 390 395 400

Val Ile Thr Lys Ile Gly Pro Arg Tyr Asp Val Glu His Leu Ile Pro
405 410 415

Glu Gly Phe Gln Val Val Ala His Gly Asp Gly Tyr Ala Ile Trp Glu
420 425 430

Lys Ile



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 709 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: alpha-hemoglobin

(ix) FEATURE:

(A) NAME/KEY: transit_peptide

(B) LOCATION: 26..241

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 245..670

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CTCGAGGGCA TCTGATCTTT CAAGAATGGC ACAAATTAAC AACATGGCAC AAGGGATACA 60

AACCCTTAAT CCCAATTCCA ATTTCCATAA ACCCCAAGTT CCTAAATCTT CAAGTTTTCT 120

TGTTTTTGGA TGTAAAAAAC TGAAAATTC AGCAAATTCT ATGTTGGTTT TGAAAAAflGBO

TTCAATTTTT ATGCAAAAGT TTTGTTCCTT TAGGATTTCA GCAGGTGGTA GAGTTTCTTG 240

CATG GTG CTG TCT CCT GCC GAC AAG ACC AAC GTC AAG GCC GCC TGG GGC 289

Val Leu Ser Pro Ala Asp Lys Thr Asn Val Lys Ala Ala Trp Gly
1 5 10 15

AAG GTT GGC GCG CAC GCT GGC GAG TAT GGT GCG GAG GCC CTG GAG A(337

Lys Val Gly Ala His Ala Gly Glu Tyr Gly Ala Glu Ala Leu Glu Arg
20 25 30

ATG TTC CTG TCC TTC CCC ACC ACC AAG ACC TAC TTC CCG CAC TTC GiaB5

Met Phe Leu Ser Phe Pro Thr Thr Lys Thr Tyr Phe Pro His Phe Asp
35 40 45

CTG AGC CAC GGC TCT GCC CAG GTT AAG GGC CAC GGC AAG AAG GTG GC3Q3

Leu Ser His Gly Ser Ala Gln Val Lys Gly His Gly Lys Lys Val Ala
50 55 60

GAC GCG CTG ACC AAC GCC GTG GCG CAC GTG GAC GAC ATG CCC AAC GOGBl

Asp Ala Leu Thr Asn Ala Val Ala His Val Asp Asp Met Pro Asn Ala
65 70 75

CTG TCC GCC CTG AGC GAC CTG CAC GCG CAC AAG CTT CGG GTG GAC C(S29

Leu Ser Ala Leu Ser Asp Leu His Ala His Lys Leu Arg Val Asp Pro
80 85 90 95

GTC AAC TTC AAG CTC CTA AGC CAC TGC CTG CTG GTG ACC CTG GCC GC3:77

Val Asn Phe Lys Leu Leu Ser His Cys Leu Leu Val Thr Leu Ala Ala
100 105 110

CAC CTC CCC GCC GAG TTC ACC CCT GCG GTG CAC GCC TCC CTG GAC MSZS

His Leu Pro Ala Glu Phe Thr Pro Ala Val His Ala Ser Leu Asp Lys
115 120 125

TTC CTG GCT TCT GTG AGC ACC GTG CTG ACC TCC AAA TAC CGT TAAGCTGGAG 677

Phe Leu Ala Ser Val Ser Thr Val Leu Thr Ser Lys Tyr Arg
130 135 140



CCTCGGTAGC CGTTCCTCCT GCCCGGTCGA CC 



(2) INFORMATION FOR SEQ ID NO: 8:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 141 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Val Leu Ser Pro Ala Asp Lys Thr Asn Val Lys Ala Ala Trp Gly Lys
1 5 10 15

Val Gly Ala His Ala Gly Glu Tyr Gly Ala Glu Ala Leu Glu Arg Met 
20 25 30 

Phe Leu Ser Phe Pro Thr Thr Lys Thr Tyr Phe Pro His Phe Asp Leu 
35 40 45 

Ser His Gly Ser Ala Gln Val Lys Gly His Gly Lys Lys Val Ala Asp
50 55 60 

Ala Leu Thr Asn Ala Val Ala His Val Asp Asp Met Pro Asn Ala Leu 
65 70 75 80 

Ser Ala Leu Ser Asp Leu His Ala His Lys Leu Arg Val Asp Pro Val 
85 90 95 

Asn Phe Lys Leu Leu Ser His Cys Leu Leu Val Thr Leu Ala Ala His 
100 105 110 

Leu Pro Ala Glu Phe Thr Pro Ala Val His Ala Ser Leu Asp Lys Phe 
115 120 125 

Leu Ala Ser Val Ser Thr Val Leu Thr Ser Lys Tyr Arg 
130 135 140 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 743 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(vii) IMMEDIATE SOURCE: 



(B) CLONE: beta-hemoglobin

(ix) FEATURE:

(A) NAME/KEY: transit_peptide

(B) LOCATION: 26..241

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 245..685

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CTCGAGGGGA TCTGATCTTT CAAGAATGGC ACAAATTAAC AACATGGCAC AAGGGATACA 60

AACCCTTAAT CCCAATTCCA ATTTCCATAA ACCCCAAGTT CCTAAATCTT CAAGTTTTCT 120

TGTTTTTGGA TCTAAAAAAC TGAAAAATTC AGCAAATTCT ATGTTGGTTT TGAAAAAAGA 180

TTCAATTTTT ATGCAAAAGT TTTGTTCCTT TAGGATTTCA GCAGGTGGTA GAGTTTCTTG 240

GATG GTG CAC CTG ACT CCT GAG GAG AAG TCT GCC GTT ACT GCC CTG TGG 289

Val His Leu Thr Pro Glu Glu Lys Ser Ala Val Thr Ala Leu Trp
1 5 10 15

GGC AAG GTG AAC GTG GAT GAA GTT GGT GGT GAG GCC CTG GGC AGG C1S3 7

Gly Lys Val Asn Val Asp Glu Val Gly Gly Glu Ala Leu Gly Arg Leu
20 25 30

CTG GTG GTC TAC CCT TGG ACC CAG AGG TTC TTT GAG TCC TTT GGG GS1S5

Leu Val Val Tyr Pro Trp Thr Gln Arg Phe Phe Glu Ser Phe Gly Asp
35 40 45

CTG TCC ACT CCT GAT GCT GTT ATG GGC AAC CCT AAG GTG AAG GCT CMS 3

Leu Ser Thr Pro Asp Ala Val Met Gly Asn Pro Lys Val Lys Ala His
50 55 60

GGC AAG AAA GTG CTG GGT GCC TTT AGT GAT GGC CTG GCT CAC CTG GTtfBl

Gly Lys Lys Val Leu Gly Ala Phe Ser Asp Gly Leu Ala His Leu Asp
65 70 75

AAC CTC AAG GGC ACC TTT GCC ACCA CTG AGT GAG CTG CAC TGT GAC AAG 529

Asn Leu Lys Gly Thr Phe Ala Thr Leu Ser Glu Leu His Cys Asp Lys
80 85 90 95

CTG CAC GTG GAT CCT GAG AGC TTC AGG CTC CTA GGC AAC GTG CTG GWl

Leu His Val Asp Pro Glu Ser Phe Arg Leu Leu Gly Asn Val Leu Val
100 105 110

TGT GTG CTG GCG CAT CAC TTT GGC AAA GAA TTC ACC CCA CCA GTG C^825

Cys Val Leu Ala His His Phe Gly Lys Glu Phe Thr Pro Pro Val Gln
115 120 125

GCT GCC TAT CAG AAA GTG GTG GCT GGT GTG GCT AAT GCC CTG GCC 0373

Ala Ala Tyr Gln Lys Val Val Ala Gly Val Ala Asn Ala Leu Ala His
130 135 140

AAG TAT CAC TAAGCTCGCT TTCTTGCTGT CCAATTTCTA TTATIAGGTTC 722

Lys Tyr His
145

CTTTGTGGGG TCGAGGTCGA C 743 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 146 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Val His Leu Thr Pro Glu Glu Lys Ser Ala Val Thr Ala Leu Trp Gly
1 5 10 15

Lys Val Asn Val Asp Glu Val Gly Gly Glu Ala Leu Gly Arg Leu Leu
20 25 30

Val Val Tyr Pro Trp Thr Gln Arg Phe Phe Glu Ser Phe Gly Asp Leu
35 40 45

Ser Thr Pro Asp Ala Val Met Gly Asn Pro Lys Val Lys Ala His Gly
50 55 60

Lys Lys Val Leu Gly Ala Phe Ser Asp Gly Leu Ala His Leu Asp Asn
65 70 75 80

Leu Lys Gly Thr Phe Ala Thr Leu Ser Glu Leu His Cys Asp Lys Leu
85 90 95

His Val Asp Pro Glu Ser Phe Arg Leu Leu Gly Asn Val Leu Val Cys
100 105 110

Val Leu Ala His His Phe Gly Lys Glu Phe Thr Pro Pro Val Gln Ala
115 120 125

Ala Tyr Gln Lys Val Val Ala Gly Val Ala Asn Ala Leu Ala His Lys
130 135 140

Tyr His 
145 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: N-terminal

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: alkalophilic Bacillus sp. 

(B) STRAIN: 38-2 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: beta-cyclodextrin

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Ala Pro Asp Thr Ser Val Ser Asn Lys Gln Asn Phe Ser Thr Asp Val
1 5 10 15

Ile



